Speech
Corpus,
Language Type:
Multilingual
Languages:
Dari/Pashto Dutch English Finnish French Hindi Icelandic Indonesian Japanese Lithuanian Malay Mandarin Nepali Portuguese Punjabi Romanian Slovenian Spanish
Availability:
From Owner
License:
CreativeCommons
Size:
467 hours
Production Status:
Newly created-finished
Use:
Person Identification
- Paper title: JukeBox: A Multilingual Singer Recognition Dataset
- Paper track: 4.3 Speaker verification and identification/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Anurag Chowdhury | JukeBox | /N |
Documentation:
Documentation in English language will be made available upon publication of the dataset.
Speech, noise and room impulse response data,
Language Type:
Multilingual
Languages:
English German Spanish
Availability:
via request, maybe public in the future?
License:
Creative Commons Attribution-NonCommercial 4.0 International Public License
Size:
2.3Gbyte Other
Production Status:
released
Use:
Development of speech-enhancement algorithms
- Paper title: Optimization and evaluation of an intelligibility-improving signal processing approach (IISPA) for the Hurricane Challenge 2.0 with FADE
- Paper track: 13.4 Intelligibility-enhancing Speech Modification/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marc René Schädler | Hurricane Challenge 2.0 development set | /N |
Documentation:
Yes, English
Written
Terminology,
Language Type:
Multilingual
Languages:
Arabic Dutch English French German Modern Greek Russian Spanish
Availability:
Freely Available
License:
Size:
4473 concepts
Production Status:
Existing-updated
Use:
Acquisition
- Paper title: Representing Multiword Term Variation in a Terminological Knowledge Base: a Corpus-Based Study
- Paper track: Terminology/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Pilar León-Araúz | EcoLexicon | /N |
Documentation:
https://ecolexicon.ugr.es/en/manual.htm
Written
Corpus,
Language Type:
Monolingual
Languages:
Spanish
Availability:
Freely Available
License:
Size:
206,937 tokens
Production Status:
Newly created-in progress
Use:
Corpus Creation/Annotation
- Paper title: A Corpus of Spanish Political Speeches from 1937 to 2019
- Paper track: Written/poster presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elena Alvarez-Mellado | A corpus of Spanish political speeches from 1937 to 2019 | /N |
Documentation:
https://github.com/lirondos/discursos-de-navidad
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English French German Japanese Korean Russian Spanish
Availability:
Freely Available
License:
CC-BY-4
Size:
68000000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Felipe Soares | ParaPat | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German Italian Spanish
Availability:
Freely Available
License:
Creative Commons
Size:
59364453 sentences
Production Status:
Newly created-finished
Use:
Word Sense Disambiguation
- Paper title: Sense-Annotated Corpora for Word Sense Disambiguation in Multiple Languages and Domains
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bianca Scarlini | OneSeC | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese Dutch French German Italian Mongolian Persian Russian Spanish Swedish Turkish
Availability:
Freely Available
License:
CC0
Size:
700 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Changhan Wang | CoVoST | /N |
Documentation:
https://github.com/facebookresearch/covost
Speech/Written
Corpus,
Language Type:
Bilingual
Languages:
Mapudungun Spanish
Availability:
Freely Available
License:
Attribution-NonCommercial 3.0 United States (CC BY-NC 3.0 US)
Size:
142 hours
Production Status:
Existing-updated
Use:
- Paper title: A Resource for Computational Experiments on Mapudungun
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Antonios Anastasopoulos | Mapudungun Corpus | /N |
Documentation:
in English
Written
Corpus,
Language Type:
Bilingual
Languages:
Basque Spanish
Availability:
From Data Center(s)
License:
CC-BY-NC-SA
Size:
637183 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Handle with Care: A Case Study in Comparable Corpora Exploitation for Neural Machine Translation
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Thierry Etchegoyhen | Basque-Spanish EiTB corpus of aligned comparable sentences V2 | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Multilingual
Languages:
Albanian Arabic Basque Bulgarian Catalan Chinese Croatian Danish Dutch English Finnish French Galician Greek Hebrew Icelandic Indonesian Italian Japanese Lithuanian Malay Norwegian Persian Polish Portuguese Romanian Slovak Slovene Spanish Swedish Thai
Availability:
Freely Available
License:
Multiple Licenses
Size:
1072646 synsets
Production Status:
Existing-used
Use:
All of the above
- Paper title: Some Issues with Building a Multilingual Wordnet
- Paper track: Infrastructural Issues/Large Projects/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | John P. McCrae | Open Multilingual WordNet | /N |
Documentation:
None




